115 research outputs found

    Using citation-context to reduce topic drifting on pure citation-based recommendation

    Recent works in the area of academic recommender systems have demonstrated the effectiveness of co-citation and citation closeness in related-document recommendations. However, documents recommended from such systems may drift away from the main theme of the query document. In this work, we investigate whether incorporating the textual information in close proximity to a citation as well as the citation position could reduce such drifting and further increase the performance of the recommender system. To investigate this, we run experiments with several recommendation methods on a newly created and now publicly available dataset containing 53 million unique citation-based records. We then conduct a user-based evaluation with domain-knowledgeable participants. Our results show that a new method based on the combination of Citation Proximity Analysis (CPA), topic modelling and word embeddings achieves more than 20% improvement in Normalised Discounted Cumulative Gain (nDCG) compared to CPA
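
    For readers unfamiliar with the metric named in this abstract, the following is a minimal sketch of one common formulation of nDCG (linear gain, log2 rank discount). The abstract does not state which variant the authors used, so the formulation and the function names `dcg` and `ndcg` are assumptions made for illustration only.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of relevance scores given in ranked order."""
    # Rank 1 has discount log2(2) = 1; lower ranks are discounted more heavily.
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def ndcg(relevances):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    # With no relevant items the ideal DCG is 0; define nDCG as 0 in that case.
    return dcg(relevances) / ideal if ideal > 0 else 0.0
```

    A list that is already sorted by relevance scores 1.0, and any ordering that places a less relevant document above a more relevant one scores lower.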

    Mr. DLib: Recommendations-as-a-Service (RaaS) for Academia

    Only a few digital libraries and reference managers offer recommender systems, although such systems could assist users facing information overload. In this paper, we introduce Mr. DLib's recommendations-as-a-service, which allows third parties to easily integrate a recommender system into their products. We explain the recommender approaches implemented in Mr. DLib (content-based filtering among others), and present details on 57 million recommendations, which Mr. DLib delivered to its partner GESIS Sowiport. Finally, we outline our plans for future development, including integration into JabRef, establishing a living lab, and providing personalized recommendations. Comment: Accepted for publication at the JCDL conference 201

    Securing Video Integrity Using Decentralized Trusted Timestamping on the Bitcoin Blockchain

    The ability to verify the integrity of video files is important for consumer and business applications alike. Especially if video files are to be used as evidence in court, the ability to prove that a file existed in a certain state at a specific time and was not altered since is crucial. This paper proposes the use of blockchain technology to secure and verify the integrity of video files. To demonstrate a specific use case for this concept, we present an application that converts a video camera enabled smartphone into a cost-effective tamperproof dashboard camera (dash cam). If the phone’s built-in sensors detect a collision, the application automatically creates a hash of the relevant video recording. This video file’s hash is immediately transmitted to the OriginStamp service, which includes the hash in a transaction made to the Bitcoin network. Once the Bitcoin network confirms the transaction, the video file’s hash is permanently secured in the tamperproof decentralized public ledger that is the blockchain. Any subsequent attempt to manipulate the video is futile, because the hash of the manipulated footage will not match the hash that was secured in the blockchain. Using this approach, the integrity of video evidence cannot be contested. The footage of dashboard cameras could become a valid form of evidence in court. In the future, the approach could be extended to automatically secure the integrity of digitally recorded data in other scenarios, including: surveillance systems, drone footage, body cameras of law enforcement, log data from industrial machines, measurements recorded by lab equipment, and the activities of weapon systems. We have made the source code of the demonstrated application available under an MIT License and encourage anyone to contribute: www.gipp.com/dt
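
    The verification idea in this abstract (a manipulated file no longer matches the hash that was timestamped) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not name the hash function, so SHA-256 is assumed, the function names `file_sha256` and `is_unaltered` are hypothetical, and submitting the hash to OriginStamp or the Bitcoin network is deliberately left out.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file (SHA-256 is an assumption here)."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large video files need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unaltered(path, anchored_hash):
    """Compare the file's current hash with the hash recorded earlier."""
    return file_sha256(path) == anchored_hash
```

    Changing even a single byte of the recording yields a different digest, so `is_unaltered` returns `False` for manipulated footage.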

    Paraphrase Types for Generation and Detection

    Current approaches in paraphrase generation and detection heavily rely on a single general similarity score, ignoring the intricate linguistic properties of language. This paper introduces two new tasks to address this shortcoming by considering paraphrase types - specific linguistic perturbations at particular text positions. We name these tasks Paraphrase Type Generation and Paraphrase Type Detection. Our results suggest that while current techniques perform well in a binary classification scenario, i.e., paraphrased or not, the inclusion of fine-grained paraphrase types poses a significant challenge. While most approaches are good at generating and detecting generally semantically similar content, they fail to understand the intrinsic linguistic variables they manipulate. Models trained in generating and identifying paraphrase types also show improvements in tasks without them. In addition, scaling these models further improves their ability to understand paraphrase types. We believe paraphrase types can unlock a new paradigm for developing paraphrase models and solving tasks in the future. Comment: Published at EMNLP 202